Summarising text with a genetic algorithm-based sentence extraction
Authors
Abstract
Automatic text summarisation has long been studied and used. The growth in the amount of information on the web has increased the demand for automatic summarisation methods. Designing a system that produces human-quality summaries is difficult, so many researchers have focused on sentence or paragraph extraction, which is one form of summarisation. In this paper, we introduce a new method for producing such extracts. Genetic Algorithm (GA)-based sentence selection is used to make a summary, and once a summary is created, it is evaluated using a fitness function. The fitness function is based on the following three factors: Readability Factor (RF), Cohesion Factor (CF) and Topic-Relation Factor (TRF). We introduce these factors and discuss the Genetic Algorithm with this specific fitness function. Evaluation results are also presented and discussed.
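The abstract names the three fitness factors but does not define them, so the following is only a minimal sketch of GA-based sentence selection, not the authors' method. It assumes simple stand-ins: RF is the mean cosine similarity between adjacent selected sentences, CF is the mean pairwise similarity among all selected sentences, and TRF is the mean similarity to a topic string such as the title. The function names, the equal weights, and the crossover and mutation operators are all hypothetical choices made for illustration.

```python
import random
from collections import Counter
from math import sqrt

def bag(text):
    """Lowercased bag-of-words vector for a piece of text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def fitness(chosen, vecs, topic, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of three ASSUMED proxies for RF, CF and TRF."""
    idx = sorted(chosen)
    pairs = [(i, j) for n, i in enumerate(idx) for j in idx[n + 1:]]
    adjacent = list(zip(idx, idx[1:]))
    # RF proxy: similarity between neighbouring sentences of the summary.
    rf = sum(cosine(vecs[i], vecs[j]) for i, j in adjacent) / len(adjacent) if adjacent else 0.0
    # CF proxy: similarity over every pair of selected sentences.
    cf = sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs) if pairs else 0.0
    # TRF proxy: similarity of each selected sentence to the topic text.
    trf = sum(cosine(vecs[i], topic) for i in idx) / len(idx)
    w_rf, w_cf, w_trf = weights
    return w_rf * rf + w_cf * cf + w_trf * trf

def crossover(a, b, k, rng):
    """Child draws k distinct sentence indices from the union of both parents."""
    return frozenset(rng.sample(sorted(a | b), k))

def mutate(ind, n, rate, rng):
    """With probability `rate`, swap one selected sentence for an unselected one."""
    outside = [i for i in range(n) if i not in ind]
    if outside and rng.random() < rate:
        drop = rng.choice(sorted(ind))
        return frozenset((ind - {drop}) | {rng.choice(outside)})
    return ind

def summarise(sentences, topic_text, k=2, pop_size=20, generations=30, rate=0.2, seed=0):
    """Evolve sets of k sentence indices and return the best extract in document order."""
    rng = random.Random(seed)
    n = len(sentences)
    vecs = [bag(s) for s in sentences]
    topic = bag(topic_text)
    score = lambda ind: fitness(ind, vecs, topic)
    pop = [frozenset(rng.sample(range(n), k)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]  # keep the fitter half unchanged
        children = [mutate(crossover(*rng.sample(elite, 2), k, rng), n, rate, rng)
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    best = max(pop, key=score)
    return [sentences[i] for i in sorted(best)]
```

Calling `summarise(sentences, title, k=3)` on a list of sentence strings returns three of them in their original order. Representing an individual as a fixed-size set of indices keeps the summary length constant, which is one simple design option among several; a bit-string encoding with a length penalty in the fitness would be another.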
Similar resources
Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization
Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...
Multi-document Summarization System: Using Fuzzy Logic and Genetic Algorithm
In recent times, the generation of multi-document summaries has gained a lot of attention among researchers. Most text summarization techniques use sentence extraction, where the salient sentences in the multiple documents are extracted and presented as a summary. In our proposed system, we have developed a sentence extraction based automatic multi-docu...
Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination
Chemical Named Entity Recognition (NER) is the basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...
MSc in Speech and Language Processing Dissertation: Automatic Summarising Based on Sentence Extraction: a Statistical Approach
The present dissertation and project describe a system for automatic summarising of texts. Instead of generating abstracts, a hard NLP task of questionable effectiveness, the system tries to identify the most important sentences of the original text, thus producing an extract. The proposed corpus-based, statistical approach exploits several heuristics to determine the summary-worthiness of ...
Towards an ANN-based Approach to Automatic Sentence Extraction of the Chinese Text
We propose an ANN-based automatic sentence extraction approach in this paper. We discuss in detail how to select the features of a sentence, and we also present the algorithms to compute the feature values. The experimental results show that this approach is feasible for implementing an automatic Chinese text abstracting system.